Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines
نویسنده
چکیده
This paper looks at a combination of two techniques, one of which, using a vector instruction set, has a long history dating back to pipelined vector supercomputers, such as the Cray 1 and its successors. The other technique, multi-threading, is also well understood. The novel approach proposed in this paper combines both vertical and horizontal micro-threading with vector instruction descriptors. It will be shown that a family of threads can represent a vector instruction with dependencies between the instances of that family, the iterations. This technique gives a very low overhead in implementing an n-way loop and is able to tolerate high memory latency. The use of micro-threading to handle dependencies between threads provides the ability to trade-off between instruction level parallelism and loop parallelism. The paper describes the means by which instruction classes may be instanced as independent parallel micro-threads and illustrates the speed-up that may be obtained compared to using a conventional loop.
منابع مشابه
Scalable and Partitionable Asynchronous Arbiter for Micro-threaded Chip Multiprocessors
This paper presents a scalable and partitionable asynchronous bus arbiter for use with chip multiprocessors (CMP) and its corresponding pre-layout simulation results using VHDL. The arbiter exploits the advantage of a concurrency control instruction (Brk) provided by the micro-threaded microprocessor model to set the priority processor and move the circulated arbitration token at the most likel...
متن کاملAsynchronous arbiter for micro-threaded chip multiprocessors
Abstract This paper presents a scalable and partitionable asynchronous bus arbiter for use with chip multiprocessors (CMP) and its corresponding pre-layout simulation results using VHDL. The arbiter exploits the advantage of a concurrency control instruction (Brk) provided by the micro-threaded microprocessor model to set the priority processor and move the circulated arbitration token at the m...
متن کاملPerformance of a Micro-threaded Pipeline
The micro-threaded microprocessor is a chip multi-processor, which uses a multi-threaded approach, where the threads are obtained from within a single context and exploit both vector and instruction level parallelism (ILP). This approach employs vertical and horizontal transfer in a simple pipeline. The horizontal transfer is referred to as the normal scalar pipeline processing used in most mic...
متن کاملThe Effect of Cache on the Performance of a Multi-Threaded Pipelined RISC Processor
This paper examines the effects of multithreaded pipelining on the CPI (cycles per instruction) of a RISC processor. The desired CPI in a conventional (single-threaded) RISC processor is one instruction per cycle. However, the CPI is typically more than one because of data hazards, control hazards, and resource hazards in the pipeline. A multi-threaded processor performs a context switch betwee...
متن کاملIntegrating Parallelizing Compilation Technology and Processor Architecture for Cost-Effective Concurrent multithreading
As the number of transistors on a single chip continues to grow, it is important to think beyond the traditional approaches of compiler optimizations for deeper pipelines and wider instruction issue units to improve performance. This single-threaded execution model limits these approaches to exploiting only the relatively small amount of instruction-level parallelism available in application pr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001